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Abstract 

Background: The coronavirus 3 chymotrypsin-like protease (3CL pro ) is a validated target in the design of potential 
anticoronavirus inhibitors. The high degree of homology within the protease's active site and substrate conservation 
supports the identification of broad spectrum lead compounds. A previous study identified the compound ML188, 
also termed 16R, as an inhibitor of the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) 3CL pro . This study 
will detail the generation of a homology model of the 3CL pro of the human coronavirus OC43 and determine the 
potential of 16R to form a broad-spectrum lead compound. MODELLER was used to generate a suitable 
three-dimensional model of the OC43 3CL pro and the Prime module of Schrodinger predicted the binding 
conformation and free energy of binding of 16R within the 3CL pro active site. Molecular dynamics further 
confirmed ligand stability and hydrogen bonding networks. 

Results: A high quality homology model of the OC43 3CL pro was successfully generated in an active conformation. 
Lurther studies reproduced the binding pose of 16R within the active site of the generated model, where its free 
energy of binding was shown to equal that of the 3CL pro of SARS-CoV, a receptor it is experimentally proven to inhibit. 
The stability of the ligand was subsequently confirmed by molecular dynamics. 

Conclusion: The lead compound 16R may represent a broad-spectrum inhibitor of the 3CL pro of OC43 and potentially 
other coronaviruses. This study provides an atomistic structure of the 3CL pro of OC43 and supports further experimental 
validation of the inhibitory effects of 16R. These findings further confirm that the 3CL pro of coronaviruses can be 
inhibited by broad spectrum lead compounds. 

Keywords: Human coronavirus, OC43, 3CL pro , Homology modelling, Molecular dynamics 

V _____,_______ J 


Background 

Human coronaviruses have a worldwide distribution and 
are frequently associated with self-limiting upper re¬ 
spiratory tract disease or “the common cold”. They can, 
however, also present with high morbidity outcomes of 
the lower respiratory tract including bronchiolitis, pneu¬ 
monia, [1-3], asthmatic exacerbations [4] and acute ex¬ 
acerbations of chronic obstructive pulmonary disease 
(COPD) [5], where human coronavirus OC43 has been 
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shown to be the prevalent strain in several regions [6]. 
Given the high burden of coronaviruses to human 
health, and their potential for genetic recombination to 
give rise to the emergence of completely novel strains, 
the presence of antiviral strategies is paramount. How¬ 
ever, there is currently no commercially available mo¬ 
lecular entity which is capable of inhibiting infection 
with human coronaviruses [7]. The 3CL pro has been vali¬ 
dated as an effective drug target in several studies and 
has even been termed “the Achilles' heel of corona¬ 
viruses” [8] making it an ideal target for the identifica¬ 
tion of novel lead compounds. 
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During coronavirus replication the large open reading 
frame (ORF) la/lab genes, located at the 5’ end of the 
genome, are responsible for expressing two large replicase 
polyproteins (pp). These are co- or post-translationally 
cleaved by the virally encoded 3CL pro and Papain-like pro¬ 
tease to yield 16 non-structural proteins responsible for 
viral replication [9]. The 3CL pro is also termed the main 
protease as it cleaves a total of 11 cleavage sites within 
ppla and pplab [10], in comparison to only three cleavage 
sites predicted for the papain-like protease [11]. 

The 3CL pro has three distinct domains. Domains I and 
II are largely composed of several anti-parallel (3 barrels 
[12] and is connected to domain III by a long loop re¬ 
gion. Domain III is composed of several globular a- 
helices and plays a role in protein dimerization, along 
with the N-terminal N-finger (residues 1-7). The active 
site is located in a chymotrypsin-like fold between do¬ 
mains I and II and contains a catalytic dyad of His41, 
which acts as a proton acceptor, and Cys 144/5, which 
undergoes nucleophilic attack on the carbonyl carbon of 
the substrate [8,13]. The 3CL pro is so named for its close 
structural and sequence homology to the 3 chymotryp- 
sin protease (3C pro ) of rhinoviruses, which contains a 
catalytic triad composed of His, Cys and Glu or Asp. 
Superimposition of the structures of 3CL pro and 3C pro 
shows that the His and Cys of both proteases is almost 
perfectly aligned, which may explain the similar sub¬ 
strate specificity and catalytic mechanism [14]. The pos¬ 
ition of the Glu/Asp in 3C pro is, however, replaced by a 
water molecule in 3CL pro , which shares three hydrogen 
bonds with surrounding residues including the catalytic 
dyad member His41. It is assumed that this water mol¬ 
ecule is responsible for stabilization of the histidine in 
the intermediate state during enzymatic cleavage [8]. 

All coronavirus 3CL pro share a high sequence hom¬ 
ology, as well as main chain architecture and substrate 
conservation [15,16]. The substrate binding site of the 
3CL pro has two deeply buried S 3 and S 2 subsites, with 
shallow Sy S 3 and S 4 subsites with varying degrees of 
solvent exposure. Substrate specificity of coronavirus 
3CL pro is mainly determined by the Pi, P 2 and Pr posi¬ 
tions [16]. The P 3 position has an absolute specificity for 
glutamine which stabilizes the Si subsite via a hydrogen 
bond with the imidazole Ne2 of Hisl62/3 and van der 
Waals interactions with surrounding residues of the Si 
pocket. The P 2 site has a preference for leucine or me¬ 
thionine to fill the hydrophobic S 2 pocket. The side- 
chains of the S 3 site are solvent exposed and therefore 
this site is expected to tolerate a wide range of function¬ 
ality, but shows a preference for basic residues [17]. 
Sidechains and backbones of residues surrounding the 
S 4 site create a highly congested pocket which favors a 
small, hydrophobic residue in the P 4 position, either Ser, 
Thr, Val or Pro [17-19]. The Si> and S 2 > subsites also 


accommodate small residues in the P x > and P 2 > positions, 
which may include Ser, Ala or Gly [18,20]. A typical 
cleavage recognition site is therefore (Ser, Ala)-(Val, 
Thr)-Leu-Glu [ (Ser, Ala, Gly), which is conserved 
among all coronavirus 3CL pro [21]. Theoretically, since 
these viruses share a highly homologous binding site 
and substrate conservation, it is reasonable to antici¬ 
pate that broad spectrum inhibitor strategies might be 
successful. [13]. 

A noncovalent, furyl amide ligand, 16R, which was 
identified by a high throughput screen, with subsequent 
lead optimization based on structure-activity relation¬ 
ships within the S 3 , Si> and S 2 binding pockets was 
shown to inhibit the 3CL pro of SARS-CoV at an IC50 of 
1.5 ± 0.3 pM [12]. This study will detail the generation of 
a theoretical model of the OC43 3CL pro based on a high 
sequence homology with the solved 3CL pro structure of 
HKU1 and assess the potential of the inhibitor, 16R, to 
inhibit this complex in a broad-spectrum manner. This 
study will therefore provide further evidence to support 
the identification of broad spectrum 3CL pro inhibitors. 

Results and discussion 

Homology modelling of the OC43 3CL pro 

The 3CL pro structure of HKU1 [PDB: 3D23] displayed a 
high sequence identity of 82.3% to the 3CL pro of OC43, 
with an e-value of 0.0. The high degree of identity can 
be partly expected as both OC43 and HKU1 are human 
coronaviruses from the Betacoronavirus genus. The ex¬ 
ceptionally high degree of identity may even further sug¬ 
gest a recent common ancestor, which has yet to be 
identified. The active site residues are also highly con¬ 
served between both sequences indicating that 3D23 
forms a highly suitable template for model generation 
(Figure 1). 

Homology models were built with MODELLER (9vl0) 
[22,23] where the lowest discrete optimized protein en¬ 
ergy (DOPE) score corresponded to model five with a 
GA341 score of 1, indicating that the model quality cor¬ 
responded with low resolution crystallographic struc¬ 
tures. The DOPE score profile of target and template 
(Figure 2) were nearly perfectly overlaid, indicating that 
the model was close to its native state. A peak in DOPE 
score for HKU1 3CL pro (3D23) was observed at approxi¬ 
mately residue 50, where OC43 3CL pro showed a moder¬ 
ate conservation in DOPE score. Colouring the HKU1 
3CL pro (3D23) structure by B-factor indicates the pres¬ 
ence of a highly variable loop region from Ser46 to 
Asp53 (Figure 3). The presence of this highly variable 
loop structure could explain the increase in the DOPE 
score profile in this region and may suggest that the 
homology model has assumed a more stable conform¬ 
ation than the template. Structural alignments where 
the root mean square deviation (RMSD) is below 2 A 


Berry et at. BMC Structural Biology (2015) 15:8 


Page 3 of 10 


Position 


10 


20 


30 


40 


50 


3D23 SSGIVKMVSPTSKIEPCIVSVTYGSMTLNGLWLDDKVYCPI 

OC43.3CL -SGIVKMVNPTSKVEPCWSVTYGNMTLNGLWLDDKVYCPf 


!\ CSSSNM JEPDYSALLCRVTLGDFT 


/: CSASDM 


60 


NPDYTNLLCRVTSSDFT 


Conserved 

******* 

* * * * 

*** ****** 

*************** *|*_«S** 

* * *** 

****** 

* * * 

Position 

70 

80 

90 

100 110 

120 

130 



3D23 IMSGRMSLTWSYQMQGCQLVLTVSLQNPYTPKYTFGNVKPGETFTVLAAYNGRPQGAFHVTMRSSYT 
0C43.3CL VLFDRLSLTVMSYQMRGCMLVLTVTLQNSRTPKYTFGWKPGETFTVLAAYNGKPQGAFHVTMRSSYT 
Conserved 


150 


160 


Position 140 
3D23 IKGS : LCGSjCftSVGYVLTGDSVKFVYI|HQLE 

OC43.3CL IKGSFLCGS O iSVGYVIMGDCVKFVYfHQLE 
Conserved ****h***4fcJ c ***** ** *****’ 


170 


180 


-STGCHTGTDFTGNFYGPYI :DA )WQLPVKDYVQTVN 


.STGCHTGTDFNGDFYGPYH 


12 $ 


200 


DA )WQLPIQDYIQSVN 

♦ afr c 3+c 3+c 3fc 3+c af: + 3#c jfc + sf: 


Position 

3D23 
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VIAWLYAAILNNCAWFVQNDVCSTEDFNVWAMANGFSQVKADLVLDALASMTGVSIETLLAAIKRLYM 
FLAWLYAAILNNCNWFIQSDKCSVEDFNVWALSNGFSQVKSDLVIDALASMTGVSLETLLAAIKRLKN 


Position 280 290 300 

3D23 GFQGRQILGSCTFEDELAPSDVYQQLAGV- 

OC43.3CL GFQGRQIMGSCSFEDELTPSDVYQQLAGIKLQ 
Conserved ******* *** ***** ********** 

Figure 1 Pairwise sequence alignment of OC43 3CL pro with the template structure of 3D23. Sequence alignment revealed a high identity of 
82.3%. Asterisks indicate conserved residues between target and template. Conserved active site residues are highlighted in red. Important 
residues within the oxyanion loop (yellow), S] pocket (blue) and S2 pocket (black) are also highlighted to display high degree of conservation 
within the active site. 
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between target and template indicates that the posi¬ 
tions of all backbone elements are correct [24,25]. 
Superimposition of the 3D23 template and modelled 
OC43 3CL pro structure displayed an RMSD of 0.327 A 
suggesting a highly accurate prediction of the position 
of all backbone elements (Figure 4). Analysis of the 
overall model quality of target and template by ProSA 
Z-score indicated that both fall within an acceptable 


range for crystallographic structures with a Z-score for 
3D23 of -7.04 and -7.34 for the homology model of 
OC43 3CL pro (Figure 5). Stereochemical analysis of 
phi-psi dihedral angles indicated that 91.8% of residues 
were in the most favoured regions with none in the dis¬ 
allowed regions (Figure 6). 

For the latter purposes of this model, in characterization 
of the inhibitory potential of 16R, it is essential to confirm 


/ \ 



Figure 2 DOPE score profiles of template, 3D23, and homology model of OC43 3CL pro . General overlay of profiles indicates the generated model 
is close to its native structure. The spike at residue 50 corresponds to a variable loop structure for which OC43 3CL pro has assumed a more 
stable conformation. 
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Figure 3 Location of highly variable loop region in the 3CL pro of HKU1 (3D23). Colouring of backbone elements by B-factor indicates the 
presence of a highly variable loop from residues Ser46 to Asp53. This variable loop region may be responsible for the spike in DOPE score for 
3D23, for which OC43 3CL pro has assumed a more stable conformation. 


that the receptor in its active conformation. To ensure the 
modelled 3CL pro of OC43 is in its active conformation, 
several features can be used to differentiate between the 
active and inactive states. These features include the main¬ 
tenance of a loop conformation of the loop connecting 
domain II and III and the oxyanion loop. If these features 
assume an alpha helical conformation it directly or indir¬ 
ectly leads to the collapse of the oxyanion hole and 
thereby forms an inactive state [26]. An alpha helix con¬ 
formation of residues 186-190 in the loop region between 
domain II and III causes a further collapse of the S 2 and 


S 4 subsites [19]. The orientation of the Hisl63 sidechain 
in the Si pocket is also vital for substrate binding [27]. 
This residue is largely kept in place by a stacking inter¬ 
action with the benzene ring of Phel40 and imidazole ring 
of Hisl63 [26]. The orientation of Hisl72 is also stabilized 
by a hydrogen bond with the sidechain of Glul66. This 
prevents steric interactions between Hisl72 and Hisl63 
allowing it to maintain its orientation in the Si pocket. A 
hydrogen bond between Hisl63 and Tyrl61 has also been 
implicated in stabilizing the sidechain of Hisl63 [27], 
however we could not observe the formation of this bond 


r 



Figure 4 Superimposition of 3D23 and homology model of OC43 3CL pro Blue ribbon represents the template structure of 3D23 with green 
representing the homology model of OC43 3CL pro Superimposition shows a high degree of structural homology with a low RMSD of 0.327 A. 
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Figure 5 Overall quality of the model as assessed by Z-score. (a) Z-score for crystallographic model of 3D23. (b) Z-score for the homology model 
of OC43 3CL pro A strong correlation between template and target structures indicates an accurate model. 
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Figure 6 Stereochemical analysis of phi-psi dihedral angle of the OC43 3CL pro . Ramachandran plot generated by PROCHECK indicates that 91.8% 
of residues are located in the most favoured regions with none in the disallowed regions. Based on analysis of 118 solved structures, a good 
model is expected to have above 90% of residues in regions A, B and I [35]. 
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in the crystal structure of SARS-CoV 3CL pro (3V3M), rais¬ 
ing the question of its importance to maintain the orienta¬ 
tion of Hisl62/3. 

Utilising these previously mentioned parameters it is 
possible to ascertain if the generated homology model of 
OC43 3CL pro is in an active conformation. With the ex¬ 
ception of the hydrogen bond between Tyrl61 and 
His 163, which may form in a dynamic system as the 
Tyrl61 hydroxyl is in close proximity to the imidazole 
nitrogen of His 163 but with incorrect geometry to form 
a hydrogen bond, all other interactions and conforma¬ 
tions were maintained indicating that the homology 
model generated using the 3CL pro structure of HKU1 


(3D23) as a template is representative of the active con¬ 
formation of the enzyme (Figure 7). With this and previ¬ 
ously mentioned results, indicating the generation of an 
appropriate homology model of the OC43 3CL pro , suggests 
the generated structure is suitable for further structure- 
based drug design techniques. 

Assessing the binding conformation and free energy of 
binding of 16R 

Analysis of the crystallographic structure of the SARS- 
CoV 3CL pro , with the bound inhibitor 16R, illustrates a 
pyridine occupying the Si pocket and forming a hydro¬ 
gen bond with the imidazole Ne2 of Hisl63. A further 



Hisl63 X 


Phel40 
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h.^3 


Gtul66 


Domain I 
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Figure 7 Features present in the homology model of OC43 3CL pro which represent of an active state of the enzyme, (a) Orientation of Hisl 63 is 
essential for substrate binding. This orientation is maintained by stacking interactions with Phe140 and a hydrogen bond with Tyr161.The 
importance of this bond is questionable as it is not observed in all crystallographic models, (b) Steric interactions between Hisl 72 and Hisl 63 
disrupt the active conformation of Hisl 63. To prevent this, Hisl 72 is stabilized by a hydrogen bond with Glu 166. (c) Maintenance of loop 
structures of the oxyanion loop (blue) and the loop connecting domain II and III (red) are essential in stabilizing the oxyanion hole. The general 
three domain structure of all 3CL pro is also depicted. 
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two hydrogen bonds are formed between ligand car¬ 
bonyls and Glyl43 and Glul66 with three methyl groups 
inserting into the deep hydrophobic S 2 pocket [12]. The 
conformation of 16R is almost completely conserved 
when bound to the 3CL pro of OC43. Notable exceptions 
include the shifting of the hydrogen bond formed with 
Glyl43 to a furan ring oxygen as opposed to the car¬ 
bonyl oxygen seen in SARS-CoV 3CL pro . The hydrogen 
bond formed with Glul66 is also absent in OC43 
3CL pro , however the distance between the ligand car- 

o 

bonyl and backbone hydrogen is 2.75 A and therefore 
this may be capable of mediating hydrogen bond forma¬ 
tion in a dynamic system (Figure 8). Free energy of bind¬ 
ing between 16R and SARS-CoV and OC43 3CL pro , as 
assessed by MM-GBSA, is -85 kcal/mol for both recep¬ 
tors. These results suggest that the inhibitor 16R may be 
capable of inhibiting both complexes. 

Molecular dynamic simulation of OC43 and 16R 

To further confirm the findings described in the previ¬ 
ous section, and to assess the potential of Glul66 to 
form a hydrogen bond in a dynamic system, molecular 
dynamic simulations were used. Molecular dynamics of 
the SARS-CoV 3CL pro -16R complex are in strong agree¬ 
ment with the crystal data of 3V3M [12]. The simulation 
predicts an average of 3.17 hydrogen bonds formed over 
the 10 ns trajectory, with these bonds predominantly 
forming with Glul66, Hisl63 and Glyl43. The bond 
formed at Glyl43 does however display the potential to 
form a bifurcated interaction with Asnl42. As predicted 
by the previous results, detailed in the section describing 
the binding conformation, the 3CL pro -16R complex of 
OC43 displayed a high occupancy of 100% and 85% at 
both His 163 and Glyl43 positions respectively. These re¬ 
sults however suggested that the bond formed with 
Glyl43 was mediated via the furan ring oxygen, where 
molecular dynamic simulations indicated that this bond 


was formed between the carbonyl oxygen and Glyl43, as 
seen with the crystal structure of SARS-CoV. Molecular 
dynamics also confirmed that the hydrogen bond at the 
Glul66 does not stably form in a dynamic environment 
and only displays the potential to form at 0-2 ns and 4- 
5 ns, with an overall occupancy of 21.62%. 

The stability of the bound ligand within the active 
pocket was also assessed via RMSD, radius of gyration 
and changes in solvent accessible surface area. A RMS 
deviation below 2 A is an indication of a ligand which is 
stably bound to its receptor [28], where an increase in 
radius of gyration from the starting reference would sug¬ 
gest ligand instability. An increase in solvent accessible 
surface area of the ligand may indicate the ligand is dis¬ 
sociating into surrounding solvent [29]. Ligand RMSD 

o 

within the active site was approximately 0.75 A and ra¬ 
dius of gyration and solvent accessible surface area 

o o 

remained within a stable range of 3.9 A to 4.0 A and 

r\ r\ 

4.8 nm to 5.4 nm respectively (Figure 9). These param¬ 
eters suggest that the ligand is highly stable within the 
active site. 

Conclusion 

The three-dimensional structure of the OC43 3CL pro 
was successfully solved by homology modelling with 
MODELLER. Analysis of various side chains and loop 
conformations within and surrounding the substrate¬ 
binding site is indicative of an active conformation of 
the enzyme. The solved structure has been made publicly 
available in the Protein Model Database (PMDB; https:// 
bioinformatics.cineca.it/PMDB/) [PMDB ID: PM0079872]. 
Further analysis of a previously identified lead compound, 
16R, suggests that it makes extensive and stable interac¬ 
tions with the 3CL pro binding site of OC43. Additional 
analysis of the free energy of binding by MM-GBSA 
suggests that the ligand binds with the same affinity to 
the 3CL pro of OC43 as it does to SARS-CoV, a receptor 
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Figure 8 Binding conformation of 16R within the active pocket of the 3CL pro (a) Pose of 16R within the active site of the SARS-CoV 3CL pro . 
Hydrogen bonds (yellow dashes) are formed with Gly143, Hisl 63 and Glu 166. (b) 16R assumes a similar pose when bound to the OC43 3CLpro. A 
notable difference is the loss a hydrogen bond with Glu 166. Hydrogen bonds with Gly 143 and Hisl 63 are however maintained. 
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Figure 9 RMSD and solvent accessible surface area of 16R when bound to the OC43 3CL pro Analysis of ligand RMSD and solvent accessible 
surface area depicts a ligand which is stably bound to its receptor. 

\___ J 


it is experimentally proven to inhibit. These results are 
strongly suggestive that the lead compound, 16R, not 
only displays the potential to inhibit the 3CL pro of 
SARS-CoV but also that of the highly homologous OC43. 
In additional studies (data not shown) 16R was shown to 
remain stably bound to the 3CL pro of 229E and NL63 in 
molecular dynamic simulations and may therefore even 
display the potential to inhibit the 3CL pro of additional 
coronaviruses. 16R is therefore an excellent lead com¬ 
pound in the pursuit of true broad spectrum inhibitors of 
all coronaviruses. 

Methods 

Homology modelling of the OC43 3CL pro 

The amino acid sequence of the OC43 3CL pro was ob¬ 
tained from the NCBI database [GenBank: AEN19363]. 
Fold assignment with the profile.build module of MOD¬ 
ELLER (9vl0) identified the most suitable template 
structure from PDB with an 82.3% sequence identity and 
e-value of 0. Template-target alignment was conducted 
with the align.2d module, which also incorporates struc¬ 
tural information from the template. Models of the 
OC43 3CL pro were built by satisfaction of homology de¬ 
rived spatial restraints on distances and dihedral angles 
[23], stereochemical restraints obtained from the 
CHARMM22 force field [30] and statistical preferences 
on dihedral angles and non-bonded distances obtained 
from known protein structures [22,31]. Models were 
built by the automodel class by minimizing the viola¬ 
tions on all restraints and united-atom PDB models were 
generated. 

DOPE scores are assigned by considering the positions 
of all non-hydrogen atoms, where the lowest DOPE 
score corresponds to the model that is predicted to be 
most accurate [31]. The evaluate_model.py script was 
further utilised to calculate per residue DOPE scores, 
which were superimposed over the template structure 
DOPE scores. A general conservation in the pattern of 
respective DOPE scores was used as an indication of an 


accurate model. A low RMSD, as assessed by PyMOL 
[32], indicated structural homology between solved 
structure and template. The GA341 score was used to 
analyse fold-assignment. Z-scores, obtained from ProSA- 
web [33,34] were used to analyse overall model quality 
and to assure template and query models were in a 
range acceptable for structures of their respective size. A 
Z-score outside a range characteristic for native proteins 
will suggest the production of an erroneous model [33]. 
Stereochemical analysis of dihedral angles was then fur¬ 
ther conducted using PROCHECK [35,36] on models 
corresponding to the lowest DOPE score with GA341 
scores amounting to 1.0. Stereochemical analysis by 
PROCHECK assessed residue geometry, where more 
than 90% of residues should be located in the “most 
favoured region” to indicate a stereochemically favourable 
structure. 

Once an appropriate model accuracy was achieved 
based on these previous observations, the active site of 
the enzyme was analysed to ensure it was in its active 
state. Several features are able to differentiate an active 
from inactive state of the 3CL pro . These include the con¬ 
formation of the oxyanion loop and loop connecting do¬ 
main II and III [19,26] and the sidechain orientation of 
residues, including Hisl63, Phel40, Hisl72, Tyrl61 and 
Glul66, within the Si pocket [26,27]. 

Assessing the binding conformation and free energy of 
binding of 16R 

Protein structures of 3V3M (3CL pro of SARS-CoV) and 
the homology modelling derived structure of OC43 
3CL pro were prepared with the Schrodinger Protein 
Preparation Wizard. Histidine protonation was assigned 
to the epsilon nitrogen in accordance with crystal data. 
A restrained minimization was then performed using the 
OPLS2005 force field [37,38], during which heavy atoms 

o 

were restrained to remain within 0.3 A of the original 
structure. To assess the conformation of the known in¬ 
hibitor, 16R, when binding to the OC43 3CL pro , the 





Berry et al. BMC Structural Biology (2015) 15:8 


Page 9 of 10 


experimental structure of 3V3M and OC43 3CL pro were 
aligned in Maestro, the interface for all Schrodingers 
software. The inhibitor, 16R, was subsequently superim¬ 
posed over the OC43 3CL pro active site using coordi¬ 
nates present in the 3V3M crystal structure. Side chains 
of residues in the active pocket assuming orientations 
that resulted in a steric clash with the ligand were re¬ 
fined and the ligand was minimized with Prime [39,40] 
to better fit the pocket. Prime utilizes both empirically 
and theoretically derived constraints to predict and rep¬ 
licate the dynamic motion of protein sidechains, allow¬ 
ing for conformational changes that arise through the 
“induced-fit” phenomenon, which is often neglected in 
molecular docking screens. 

The Prime/MM-GBSA method was used to calculate 
the binding free energy of 16R within the respective re¬ 
ceptor. The free energy of binding is the calculated dif¬ 
ference between the minimized receptor-inhibitor 
complex and the sum of the energies of the minimized 
unbound inhibitor and receptor. Ligand poses were min¬ 
imized using the local optimization feature in Prime, 
where energies of the complex were calculated with the 
OPLS-2005 force field and GBSA continuum solvent 
model. 

Molecular dynamic simulation of the 3CL pro -16R complex 

The CHARMM27 all atom force field was assigned to 
receptor structures using the three point TIP3P water 
model. All input hydrogens in the coordinate file were 
ignored and were assigned according to the force field. 
Histidine protonation states were assigned to the epsilon 
nitrogen in the neutral form in accordance with crystal¬ 
lographic data. The ligand, in the previously predicted 
pose, was prepared by SwisParam [41]. The protein- 
ligand complex was then placed in the centre of a sol¬ 
vated, cubic box, 1 nm from the box edge. The system 
was solvated with the spc216 water model and sodium 
counter ions were added to neutralize the system. The 
system was energy minimized by a brief dynamics simu¬ 
lation using Steepest Descent Minimization until stee¬ 
pest descent converged to Fmax < 1000. Long range 
electrostatic interactions were calculated with the Par¬ 
ticle Mesh Ewald (PME) method [42] with a cutoff of 
1 nm and periodic boundary conditions were applied. 
Both NVT and NPT equilibrations were run for 50 000 
steps or 100 ps using a 2 fs time step and a leap-frog in¬ 
tegrator. All bonds were constrained by the lines algo¬ 
rithm. PME was again used for long range electrostatics 
with a 0.16 fourier spacing. Short range electrostatic and 
van der Waals cutoffs were 1.0 nm and 1.4 nm respect¬ 
ively. Coordinates, velocities and energies were saved 
every 0.2 ps. During NVT equilibration temperature 
coupling was achieved by V-rescale algorithm with a tar¬ 
get temperature of 300 K. Pressure coupling during NPT 


equilibration was achieved via the Parrinello-Rahman al¬ 
gorithm and is incorporated in the NPT equilibration 
once temperature is stable to ensure the proper density 
is reached (approximately 1000 kg/m ). Following NVT 
and NPT system equilibration an unrestrained, 10 ns 
simulation (5 000 000 steps) was run with the leap frog 
integrator, saving coordinates, velocities and energies 
every 2 ps. All other parameters were maintained from the 
NVT and NPT equilibration including both temperature 
and pressure coupling. 
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